18654: Rename boolean Interval constants to match NullableInterval#22
18654: Rename boolean Interval constants to match NullableInterval#22martin-augment wants to merge 1 commit intomainfrom
Interval constants to match NullableInterval#22Conversation
CERTAINLY_FALSE -> FALSE CERTAINLY_TRUE -> TRUE UNCERTAIN -> TRUE_OR_FALSE
WalkthroughThis pull request refactors boolean interval arithmetic constants across the DataFusion codebase. In ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Pull Request Review: Rename boolean
|
| .unwrap(); | ||
|
|
||
| assert_eq!(null_interval, Interval::UNCERTAIN); | ||
| assert_eq!(null_interval, Interval::TRUE_OR_FALSE); |
There was a problem hiding this comment.
A few lines below, the comment still says Interval::UNCERTAIN; consider updating it to Interval::TRUE_OR_FALSE to match the new naming (the code already uses TRUE_OR_FALSE).
🤖 Was this useful? React with 👍 or 👎
There was a problem hiding this comment.
value:good-to-have; category:documentation; feedback:The Augment AI reviewer is correct that one occurrence of Interval::UNCERTAIN was left unchanged in the docstrings. It does not cause any problems now but it would be confusing once the deprecated constants are removed in a later version
There was a problem hiding this comment.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
datafusion/physical-expr/src/intervals/cp_solver.rs (1)
647-654: OR with parent TRUE can surface as user-facing error — reachability confirmed, fallback recommendedThis code path IS reachable. The
update_rangesmethod is called withInterval::TRUEin production code:
symmetric_hash_join.rs:1243in join filter evaluationanalysis.rs:217in expression analysisIf a filter expression contains an OR operator, the solver's top-down propagation will hit this guard and emit
not_impl_err!. The underlyingBinaryExpr::propagate_constraintscan handle TRUE+OR (verified in test), so a fallback likeOk(Some(vec![]))(treat as no refinement) is feasible until multi-set propagation is implemented.Consider either:
- Switching to a non-fatal fallback to avoid blocking real queries
- Adding a test case that exercises this from
update_ranges(not just directpropagate_constraintscalls)
🧹 Nitpick comments (3)
datafusion/functions/src/math/monotonicity.rs (1)
34-38: Monotonicity domain checks consistently migratedAll in-domain validations now compare to Interval::TRUE, matching the new boolean-interval API. No semantic drift detected.
If there’s a helper like is_true(), consider it for brevity.
Also applies to: 75-80, 113-118, 210-215, 374-381, 501-506, 539-544, 577-582, 704-709
datafusion/physical-expr/src/expressions/binary.rs (1)
371-403: OR propagation aligns with rename; verify infeasibility/precision and add tests for new variantsParent FALSE case properly forces both children to FALSE and returns None when infeasible. For parent TRUE, targeted refinements when the other side is TRUE_OR_FALSE are good. As with AND, consider handling TRUE_OR_UNKNOWN/FALSE_OR_UNKNOWN/ANY_TRUTH_VALUE for better precision and add tests mirroring these cases.
I can add unit tests that cover the new boolean variants once we confirm their presence/semantics.
datafusion/physical-expr/src/intervals/cp_solver.rs (1)
1641-1677: Tests updated to the new constants — add cases for extended variantsGood coverage of TRUE/FALSE/TRUE_OR_FALSE in OR/AND propagation. Consider adding tests for TRUE_OR_UNKNOWN, FALSE_OR_UNKNOWN, and ANY_TRUTH_VALUE if these are now part of Interval to keep parity with NullableInterval.
I can submit the additional test vectors once we confirm the constructors/constants.
Also applies to: 1689-1713
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
datafusion/expr-common/src/interval_arithmetic.rs(31 hunks)datafusion/expr-common/src/statistics.rs(5 hunks)datafusion/functions/src/math/abs.rs(1 hunks)datafusion/functions/src/math/monotonicity.rs(9 hunks)datafusion/physical-expr-common/src/physical_expr.rs(2 hunks)datafusion/physical-expr/src/analysis.rs(1 hunks)datafusion/physical-expr/src/expressions/binary.rs(2 hunks)datafusion/physical-expr/src/expressions/not.rs(1 hunks)datafusion/physical-expr/src/intervals/cp_solver.rs(10 hunks)datafusion/physical-expr/src/statistics/stats_solver.rs(1 hunks)datafusion/physical-plan/src/joins/symmetric_hash_join.rs(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (4)
datafusion/physical-plan/src/joins/symmetric_hash_join.rs (1)
datafusion/physical-expr/src/intervals/cp_solver.rs (1)
graph(647-650)
datafusion/physical-expr-common/src/physical_expr.rs (1)
datafusion/common/src/scalar/mod.rs (1)
new_one(1567-1642)
datafusion/physical-expr/src/intervals/cp_solver.rs (1)
datafusion/physical-expr/src/expressions/binary.rs (10)
left(105-107)left(848-848)left(852-852)right(110-112)right(849-849)right(853-853)children(304-306)expr(132-132)expr(503-503)propagate_constraints(326-414)
datafusion/functions/src/math/abs.rs (1)
datafusion/expr-common/src/statistics.rs (6)
range(167-175)range(328-330)range(402-409)range(442-444)range(503-514)range(571-573)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Cursor Bugbot
- GitHub Check: claude-review
- GitHub Check: Analyze (rust)
🔇 Additional comments (11)
datafusion/physical-plan/src/joins/symmetric_hash_join.rs (1)
1243-1244: Rename aligns with pruning semanticsUsing Interval::TRUE for graph.update_ranges is correct and consistent with the new boolean interval naming. No behavior change beyond the rename.
datafusion/functions/src/math/abs.rs (1)
178-184: Ordering check correctly updatedComparing to Interval::TRUE preserves the original intent for abs monotonicity. Looks good.
datafusion/physical-expr/src/analysis.rs (1)
217-231: Propagation gate uses new truth constantupdate_ranges(..., Interval::TRUE) is consistent with the refactor; success/infeasible/cannot-propagate handling unchanged.
datafusion/physical-expr/src/statistics/stats_solver.rs (1)
156-166: Subset check correctly migratedUsing Interval::TRUE to confirm full containment before adopting given_stats maintains prior behavior with the new enum.
datafusion/physical-expr/src/expressions/not.rs (1)
156-176: NOT propagation updated to TRUE/FALSEBranching against Interval::TRUE and Interval::FALSE is correct with the new naming; behavior preserved.
datafusion/physical-expr-common/src/physical_expr.rs (1)
248-258: Boolean interval → Bernoulli mapping remains correctTRUE maps to p=1, FALSE to p=0, otherwise null probability. Propagation path retains the “must be certain” assumption.
Also applies to: 310-320
datafusion/expr-common/src/statistics.rs (2)
162-165: Bernoulli semantics and tests updated coherentlyRange and comparison handling now use FALSE/TRUE/TRUE_OR_FALSE; unit tests reflect this. All consistent.
Also applies to: 503-514, 721-729, 882-883, 995-996
1-20: Verification confirms proper constant migration—old names are deprecated, not lingeringThe old constants (
CERTAINLY_TRUE,CERTAINLY_FALSE,UNCERTAIN) are properly handled as deprecated aliases marked with#[deprecated(since = "52.0.0")]that redirect to the new names. References in comments are documentation only. The new constants (Interval::TRUE,Interval::FALSE,Interval::TRUE_OR_FALSE) are integrated throughout the codebase across multiple modules. No lingering issues found.datafusion/physical-expr/src/intervals/cp_solver.rs (2)
521-533: Bounds containment checks updated to boolean-interval API; semantics preservedgiven_range.contains(bounds)? == Interval::TRUE and bounds.contains(given_range)? != Interval::FALSE maintain the previous tri-state logic. No action needed.
348-383: All changes verified: constant rename applied correctly, no stale names in active codeVerification confirms no occurrences of old constant names (CERTAINLY_TRUE, CERTAINLY_FALSE, UNCERTAIN) in active code outside interval_arithmetic.rs—only as backward-compatibility aliases in definitions. The cp_solver.rs propagate_comparison function correctly uses the new names (Interval::TRUE, Interval::FALSE, Interval::TRUE_OR_FALSE) and handles all branches appropriately, with uncertain parents returning Ok(None) as expected. The logic preserves prior behavior.
datafusion/physical-expr/src/expressions/binary.rs (1)
336-368: Review comment is incorrect due to type mismatch; Interval and NullableInterval are separate typesThe code in binary.rs:336–368 uses the
Intervaltype, which defines only three boolean constants:TRUE,FALSE, andTRUE_OR_FALSE. The suggested constantsTRUE_OR_UNKNOWN,FALSE_OR_UNKNOWN, andANY_TRUTH_VALUEexist only on theNullableIntervaltype and are not available onInterval. Since the file does not import or useNullableInterval, the suggestion to add support for these constants is not applicable to this code.Likely an incorrect or invalid review comment.
18654: To review by AI